Phonetic segmentation of singing voice using MIDI and parallel speech
Authors
Abstract
Analyzing a singing voice signal requires knowing the boundaries of each phonetic unit in the singing voice samples. However, because vowels are prolonged in singing, it is not easy to accurately align a singing voice with the phonetic sequence of its lyrics using a conventional speech recognition approach. This paper proposes a solution for the phonetic annotation of the singing voice, given a MIDI file and a parallel speech recording of the lyrics. The MIDI file, which contains notation and lyric information, is used to locate the lyrics in the singing voice. The parallel speech recording is used to generate a reference phonetic annotation by force-aligning it with the lyrics using a speech recognizer. The singing voice is then aligned with the phonetically annotated speech, and the phonetic boundaries are mapped onto the singing voice. The results show that this approach yields an accurate annotation of phonetic boundaries in the singing voice.
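The last step of the abstract, aligning the singing voice with the annotated speech and carrying the phonetic boundaries across, can be illustrated with a small sketch. The abstract does not name the alignment method, so dynamic time warping (DTW) over frame-level features is an assumption here, and the function names `dtw_path` and `map_boundaries` are hypothetical, not taken from the paper.

```python
import numpy as np


def dtw_path(speech, singing):
    """Return a DTW warping path between two feature arrays (frames x dims).

    Assumption: DTW is used only for illustration; the paper's abstract
    does not state which alignment technique it applies.
    """
    n, m = len(speech), len(singing)
    # Euclidean distance between every speech frame and every singing frame.
    cost = np.linalg.norm(speech[:, None, :] - singing[None, :, :], axis=2)
    # Accumulated cost with a padded border of infinities.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first to recover the path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1]


def map_boundaries(path, speech_boundaries):
    """Map speech boundary frames to the first singing frame aligned with each."""
    first_match = {}
    for speech_idx, singing_idx in path:
        first_match.setdefault(speech_idx, singing_idx)
    return [first_match[b] for b in speech_boundaries]
```

With toy one-dimensional features, a four-frame spoken sequence whose second phone starts at frame 2 and a seven-frame sung sequence in which the first phone is held for four frames, `map_boundaries(dtw_path(speech, singing), [2])` places the boundary at singing frame 4. Real features (for example, spectral frames) and any MIDI-based constraints on the search region would replace the toy inputs.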
Similar resources
Outlines of Burcas - A simple concatenation-based MIDI-to-singing voice synthesis system
The present paper outlines a simple system (yet to be completed) for concatenation-based singing synthesis in Swedish. The system, called Burcas, takes as input a MIDI file (possibly holding multiple parts) for melody and a text file for lyrics, and it produces standard audio files as output. For the digital signal processing, the MBROLA speech generator is employed. Burcas consists of an input...
A Score-to-singing voice synthesis System for the Greek Language
In this paper, we examine the possibility of generating Greek singing voice with the MBROLA synthesizer, by making use of an already existing diphone database. Our goal is to implement a score-to-singing synthesis system, where the score is written in a score editor and saved in the MIDI format. However, MBROLA accepts phonetic files as input rather than MIDI ones and because of this, we constr...
Improvements to a Sample-Concatenation Based Singing Voice Synthesizer
This paper describes recent improvements to our singing voice synthesizer based on concatenation and transformation of audio samples using spectral models. Improvements include, firstly, robust automation of the previous singer database creation process, a lengthy and tedious task which involved recording script generation, studio sessions, audio editing, spectral analysis, and phonetic based segmen...
Concatenation-based MIDI-to-Singing Voice Synthesis
In this paper, we propose a system for synthesizing the human singing voice and the musical subtleties that accompany it. The system, Lyricos, employs a concatenation-based text-to-speech method to synthesize arbitrary lyrics in a given language. Using information contained in a regular MIDI file, the system chooses units, represented as sinusoidal waveform model parameters, from an inventory of ...